-
Notifications
You must be signed in to change notification settings - Fork 42
Extend methods to generate Selection Schema for CVE and other projects #1002
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Adds post-processing and serialization enhancements for SSVC selection models to produce cleaner, schema-ready dumps and JSON output with RFC3339 UTC timestamps.
- Introduces a _post_process step to normalize selection values and strip empty arrays.
- Overrides model_dump and model_dump_json to apply post-processing and format timestamps.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
ahouseholder
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Neither the issue #996 nor the PR explain what specific problems this PR is intended to solve. The PR description says what changes are here, but not why.
- There are no tests for the newly added functionality. This would also help show what the expected behavior is (i.e., what's getting eliminated by _post_process()?) →
src/test/test_selections.py - I have inline comments/questions too
src/ssvc/selection.py
Outdated
| fix_selection(sel) for sel in data["selections"] if sel | ||
| ] | ||
| # Remove empty array fields from the top level | ||
| keys_to_delete = [k for k, v in data.items() if isinstance(v, list) and not v] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we eliminate the need for the keys_to_delete loop by using exclude_none=True or exclude_if = lambda: ... in the class definition?
src/ssvc/selection.py
Outdated
| """ | ||
| Ensures all Selection.values are lists and removes empty array elements. | ||
| """ | ||
| def fix_selection(selection): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what this is doing, but I don't understand why we need it? Is there an example of a Selection object that needs this treatment? How would that get created? If so, can we put that into a unit test to demonstrate the need for this method?
src/ssvc/selection.py
Outdated
| model_dump_kwargs = kwargs.copy() | ||
| json_kwargs = {} | ||
| # List of json.dumps kwargs you want to support | ||
| json_kwarg_names = ['indent', 'sort_keys', 'separators', 'ensure_ascii'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to be picky about this instead of just passing **kwargs along? (If we do it this way, then we're breaking the expectation that model_dump_json passes kwargs through.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeh will get rid of this for something simpler
return json.dumps(data, **{k: v for k, v in kwargs.items() if k in json.dumps.__code__.co_varnames})
src/ssvc/selection.py
Outdated
| data = self._post_process(data) | ||
| # Format timestamp as UTC RFC3339 string | ||
| if "timestamp" in data and isinstance(data["timestamp"], datetime): | ||
| ts = data["timestamp"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this in a field serializer instead? If so, I think this would let us avoid having to override model_dump_json() entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was to standardize UTC conversion. I think it is not necessary. People may want to keep Timezone. I will get rid of it.
{
"timestamp": "2025-10-09T13:59:00.132847-04:00",
"schemaVersion": "2.0.0",
"target_ids": [],
"selections": [
{
"namespace": "ssvc",
"key": "E",
"version": "1.1.0",
"name": "Exploitation",
"values": [
{
"name": "Public PoC",
"key": "P"
}
]
}
],
"decision_point_resources": [],
"references": []
}after _post_process {
"timestamp": "2025-10-09T17:50:50Z",
"schemaVersion": "2.0.0",
"selections": [
{
"namespace": "ssvc",
"key": "E",
"version": "1.1.0",
"name": "Exploitation",
"values": [
{
"name": "Public PoC",
"key": "P"
},
{
"name": "Active",
"key": "A"
}
]
}
]
}
|
|
Ah, okay, I was totally missing the top-level items that could be empty lists, and was only thinking about |
ahouseholder
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This version looks good to me.
Added extension methods to support export of SSVC Selection data into schema supported format.
This pull request introduces enhancements to the serialization logic for selection models in
src/ssvc/selection.py. The changes ensure that dumped data is consistently formatted, with empty or invalid values removed and timestamps properly formatted in JSON outputs.Key improvements to serialization and data cleanup:
_post_processmethod to ensure allSelection.valuesare lists, remove empty or falsy items, and clean up empty array fields from the top-level dictionary.model_dumpandmodel_dump_jsonmethods to apply the post-processing step before returning data, and to format timestamps as UTC RFC3339 strings in JSON output.